Vol 9 no.1 2009

Mouath A. HOURANI, Ibrahiem M. M. EL EMARY

Faculty of Information Technology, Al Ahliyya Amman University, Amman 19328, Jordan
Faculty of Engineering, Al Ahliyya Amman University, Amman 19328, Jordan
e-mail: 1mouath.hourani@gmail.com, omary57@hotmail.com

Abstract

    This paper presents an innovative missing value estimation algorithm called Linear Stepwise Regression (LSR), which uses multiple correlation-based sample imputation matrices for the final prediction of missing values. The matrices are computed and optimized using linear stepwise regression and linear programming methods. The performance of the LSR impute method, assessed over five different data sets, is compared with four imputation approaches, namely the KNN, LSS, LSimpute3 and LSimpute5 impute methods. Test results show that LSR impute predicts missing values very accurately for some data sets and is robust to increasing rates of missing values. A comprehensive comparison of the normalized root mean squared error (NRMSE) on the five data sets shows that LSR impute performs comparably to, if not better than, the other missing value estimation methods in this area, and that, when complemented with other leading methods, it appears to be a suitable solution for missing value estimation in gene expression profiles. Finally, the LSR method is also applicable to various non-bioinformatics data.
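    As a rough illustration of the two ideas the abstract relies on, regression-based estimation of a missing entry and NRMSE as the error measure, the sketch below may help. It is not the LSR algorithm: it uses a single predictor row and ordinary least squares, with no stepwise selection, imputation matrices or linear programming. The function names (`pearson`, `impute_by_regression`, `nrmse`) are hypothetical, and the NRMSE normalization shown (root mean squared error divided by the standard deviation of the true values) is one common convention that the paper may or may not follow.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def impute_by_regression(target, candidates, missing_idx):
    """Estimate target[missing_idx] from the most correlated complete row.

    Assumption: a simplified stand-in for regression-based imputation,
    using one predictor row and an ordinary least-squares line.
    """
    observed = [i for i in range(len(target)) if i != missing_idx]
    y = [target[i] for i in observed]
    # Pick the candidate row with the largest absolute correlation to the target.
    best = max(candidates,
               key=lambda row: abs(pearson([row[i] for i in observed], y)))
    x = [best[i] for i in observed]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        return my  # constant predictor: fall back to the target's mean
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return my + slope * (best[missing_idx] - mx)


def nrmse(true_vals, imputed_vals):
    """RMSE of the imputed values divided by the std. dev. of the true values."""
    n = len(true_vals)
    mse = sum((t - p) ** 2 for t, p in zip(true_vals, imputed_vals)) / n
    mean_t = sum(true_vals) / n
    var_t = sum((t - mean_t) ** 2 for t in true_vals) / n
    return math.sqrt(mse / var_t)


# Toy data: the entry at index 3 of `target` is treated as missing.
target = [1.0, 2.0, 3.0, None, 5.0]
candidates = [[2.0, 4.0, 6.0, 8.0, 10.0], [5.0, 1.0, 4.0, 2.0, 3.0]]
estimate = impute_by_regression(target, candidates, missing_idx=3)
```

    Under this normalization, an NRMSE near 0 means the imputed values closely match the held-out true values, while imputing every entry with the mean of the true values gives an NRMSE of 1.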
